Online Gaussian process for nonstationary speech separation

Authors

  • Hsin-Lung Hsieh
  • Jen-Tzung Chien
Abstract

A practical speech enhancement system must recover speech from mixed signals that are corrupted by nonstationary source signals and mixing conditions. The source voices may come from different moving speakers, who may appear or disappear abruptly and whose order may be permuted continuously. To handle these scenarios with a varying number of sources, we present a new method for nonstationary speech separation. An online Gaussian process independent component analysis (OLGP-ICA) is developed to characterize the real-time temporal structure in a time-varying mixing system and to capture the evolving statistics of independent sources from signals observed online. A variational Bayes algorithm is established to estimate the evolving parameters for dynamic source separation. In the experiments, the proposed OLGP-ICA is compared with other ICA methods and is shown to be effective in recovering speech and music signals in a nonstationary speaking environment.
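The time-varying mixing problem described in the abstract can be illustrated with a small sketch. This is not the OLGP-ICA algorithm; it only assumes a noise-free instantaneous model x_t = A_t s_t in which a hypothetical 2x2 mixing matrix rotates slowly, as it might for a moving speaker. The function names, the rotation schedule, and the Laplace-distributed stand-in sources are all assumptions made for illustration.

```python
import numpy as np

def rotating_mixing(t, n_steps, max_angle=np.pi / 3):
    """Hypothetical 2x2 mixing matrix that rotates slowly with time index t."""
    theta = max_angle * t / (n_steps - 1)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    base = np.array([[1.0, 0.5],
                     [0.3, 1.0]])
    return rot @ base

def mix(sources):
    """Apply x_t = A_t s_t sample by sample; sources has shape (2, n_steps)."""
    n_steps = sources.shape[1]
    mixed = np.empty_like(sources)
    for t in range(n_steps):
        mixed[:, t] = rotating_mixing(t, n_steps) @ sources[:, t]
    return mixed

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
n_steps = 2000
# Super-Gaussian stand-ins for speech sources (an assumption, not real audio).
sources = rng.laplace(size=(2, n_steps))
mixed = mix(sources)

# Demix with the inverse of the initial mixing matrix only,
# i.e. pretend the mixing system is stationary.
fixed_demix = np.linalg.inv(rotating_mixing(0, n_steps))
estimate = fixed_demix @ mixed

early_error = mse(estimate[:, :200], sources[:, :200])
late_error = mse(estimate[:, -200:], sources[:, -200:])
print(f"early MSE: {early_error:.4f}, late MSE: {late_error:.4f}")
```

The fixed demixing matrix is exact at the first sample and degrades as the mixing matrix drifts, which is the kind of situation that motivates estimating the separation parameters online.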

Similar resources

A new method for blind source separation of nonstationary signals

Many algorithms for blind source separation have been introduced in the past few years, most of which assume statistically stationary sources. In many applications, such as separation of speech or fading communications signals, the sources are nonstationary. We present a new adaptive algorithm for blind source separation of nonstationary signals which relies only on the nonstationary nature of ...

Use of bimodal coherence to resolve the permutation problem in convolutive BSS

Recent studies show that facial information contained in visual speech can be helpful for the performance enhancement of audio-only blind source separation (BSS) algorithms. Such information is exploited through the statistical characterization of the coherence between the audio and visual speech using, e.g., a Gaussian mixture model (GMM). In this paper, we present three contributions. With th...

Blind separation of instantaneous mixtures of nonstationary sources

Most ICA algorithms are based on a model of stationary sources. This paper considers exploiting the (possible) non-stationarity of the sources to achieve separation. We introduce two objective functions based on the likelihood and on mutual information in a simple Gaussian non stationary model and we show how they can be optimized, off-line or on-line, by simple yet remarkably efficient algorit...

A Bayesian Prediction Approach to Robust Speech Recognition and Online Speaker Adaptation

Because the acoustic environments are uncertain and nonstationary, it is necessary to characterize the uncertainty of speech hidden Markov models (HMM’s) for recognition and trace the uncertainty sequentially to match the nonstationary environments. In this study, we develop a new Bayesian predictive classification (BPC) framework for robust decision and online speaker adaptation. The BPC decis...

Conjugate Gamma Markov Random Fields for Modelling Nonstationary Sources

In modelling nonstationary sources, one possible strategy is to define a latent process of strictly positive variables to model variations in second order statistics of the underlying process. This can be achieved, for example, by passing a Gaussian process through a positive nonlinearity or defining a discrete state Markov chain where each state encodes a certain regime. However, models with s...

Publication date: 2010